Speech synthesis without a phone inventory

Authors

  • Matthew P. Aylett
  • Simon King
  • Junichi Yamagishi
Abstract

In speech synthesis the unit inventory is decided using phonological and phonetic expertise. This process is resource-intensive and potentially sub-optimal. In this paper we investigate how acoustic clustering, together with lexicon constraints, can be used to build a self-organised inventory. Six English speech synthesis systems were built using two frameworks, unit selection and parametric HTS, for three inventory conditions: 1) a traditional phone set, 2) a system using orthographic units, and 3) a self-organised inventory. A listening test showed a strong preference for the classic system, and for the orthographic system over the self-organised system. Results also varied by letter-to-sound complexity and database coverage. This suggests the self-organised approach failed to generalise pronunciation and also introduced noise above and beyond that caused by orthographic sound mismatch.

Related resources

Phonetically enriched labeling in unit selection TTS synthesis

Unit selection techniques have improved the quality of text-to-speech (TTS) synthesis. However, mistakes which had been less noticeable previously in poorer quality synthetic speech become very noticeable in more natural-sounding synthetic speech. Many problems appear to be caused by mismatches between phones requested by the TTS frontend and phones selected from the labeled speech inventory. Gi...

Phrase splicing and variable substitution using the IBM trainable speech synthesis system

This paper describes a phrase splicing and variable substitution system which offers an intermediate form of automated speech production lying between the extremes of recorded utterance playback and full Text-to-Speech synthesis. The system incorporates a trainable speech synthesiser and an application-specific set of pre-recorded phrases. The text to be synthesised is converted to a phone se...

Hybrid syllable/triphone speech synthesis

In this paper, the syllable, an alternative phonetic unit to the phone, is researched in the context of speech synthesis. Several approaches to syllable modelling within the statistical approach (using hidden Markov models) to the acoustic unit inventory creation are proposed and evaluated. To be able to synthesize an arbitrary text, the syllable inventories were supplemented with triphones res...

Construction of the acoustic inventory for a Greek text-to-speech concatenative synthesis system

The development of the Greek Text-To-Speech (TTS) system by NTUA is based on the method of concatenative synthesis and follows the Bell Labs approach to this technique. Concatenative synthesis is one of the simplest methods for speech synthesis and at the same time bypasses most of the problems encountered by articulatory and formant synthesis techniques. The method relies on designing and crea...

Synthesizing fast speech by implementing multi-phone units in unit selection speech synthesis

This paper presents a new approach to synthesizing fast speech in unit selection synthesis. After recording two inventories, one at normal and one at fast speech rate, articulated as accurately as possible, speech was synthesized from both corpora independently. Since fast speech differs from normal rate speech in terms of acoustic characteristics, the concept of multi-phone (phoxsy) units propose...

Publication date: 2009